{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fungal ITS QIIME analysis tutorial\n",
    "==================================\n",
    "\n",
    "In this tutorial we illustrate steps for analyzing fungal ITS amplicon data using the QIIME/UNITE reference OTUs (alpha version 12_11) to compare the composition of 9 soil communities using open-reference OTU picking. More recent ITS reference databases based on UNITE are available on the [QIIME resources page](http://qiime.org/home_static/dataFiles.html). The steps in this tutorial can be generalized to work with other marker genes, such as 18S.\n",
    "\n",
    "We recommend working through the [Illumina Overview Tutorial](http://qiime.org/tutorials/illumina_overview_tutorial.html) before working through this tutorial, as it provides more detailed annotation of the steps in a QIIME analysis. This tutorial is intended to highlight the differences that are necessary to work with a database other than QIIME's default reference database. For ITS, we won't build a phylogenetic tree and therefore use nonphylogenetic diversity metrics. Instructions are included for how to build a phylogenetic tree if you're sequencing a non-16S, phylogenetically-informative marker gene (e.g., 18S).\n",
    "\n",
    "First, we obtain the tutorial data and reference database:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!(wget ftp://ftp.microbio.me/qiime/tutorial_files/its-soils-tutorial.tgz || curl -O ftp://ftp.microbio.me/qiime/tutorial_files/its-soils-tutorial.tgz)\n",
    "!(wget ftp://ftp.microbio.me/qiime/tutorial_files/its_12_11_otus.tgz ||  curl -O ftp://ftp.microbio.me/qiime/tutorial_files/its_12_11_otus.tgz)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now unzip these files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!tar -xzf its-soils-tutorial.tgz\n",
    "!tar -xzf its_12_11_otus.tgz\n",
    "!gunzip ./its_12_11_otus/rep_set/97_otus.fasta.gz\n",
    "!gunzip ./its_12_11_otus/taxonomy/97_otu_taxonomy.txt.gz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can then view the files in each of these direcories by passing the directory name to the ``FileLinks`` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from IPython.display import FileLink, FileLinks\n",
    "FileLinks('its-soils-tutorial')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ``params.txt`` file modifies some of the default parameters of this analysis. You can review those by clicking the link or by catting the file. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!cat its-soils-tutorial/params.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The parameters that differentiate ITS analysis from analysis of other amplicons are the two ``assign_taxonomy`` parameters, which are pointing to the reference collection that we just downloaded."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We're now ready to run the ``pick_open_reference_otus.py`` workflow. Discussion of these methods can be found in [Rideout et. al (2014)](https://peerj.com/articles/545/).\n",
    "\n",
    "Note that we pass `-r` to specify a non-default reference database. We're also passing `--suppress_align_and_tree` because we know that trees generated from ITS sequences are generally not phylogenetically informative."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!pick_open_reference_otus.py -i its-soils-tutorial/seqs.fna -r its_12_11_otus/rep_set/97_otus.fasta -o otus/ -p its-soils-tutorial/params.txt --suppress_align_and_tree"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note:** If you would like to build a phylogenetic tree (e.g., if you're using a phylogentically-informative marker gene such as 18S instead of ITS), you should remove the ``--suppress_align_and_tree`` parameter from the above command and add the following lines to the parameters file:\n",
    "\n",
    "```\n",
    "align_seqs:template_fp <path to reference alignment>\n",
    "filter_alignment:suppress_lane_mask_filter True\n",
    "filter_alignment:entropy_threshold 0.10\n",
    "```\n",
    "\n",
    "After that completes (it will take a few minutes) we'll have the OTU table with taxonomy. You can review all of the files that are created by passing the path to the `index.html` file in the output directory to the `FileLink` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "FileLink('otus/index.html')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can then pass the OTU table to ``biom summarize-table`` to view a summary of the information in the OTU table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!biom summarize-table -i otus/otu_table_mc2_w_tax.biom"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we run several core diversity analyses, including alpha/beta diversity and taxonomic summarization. We will use an even sampling depth of 353 based on the results of `biom summarize-table` above. Since we did not built a phylogenetic tree, we'll pass the `--nonphylogenetic_diversity` flag, which specifies to compute Bray-Curtis distances instead of UniFrac distances, and to use only nonphylogenetic alpha diversity metrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "!core_diversity_analyses.py -i otus/otu_table_mc2_w_tax.biom -o cdout/ -m its-soils-tutorial/map.txt -e 353 --nonphylogenetic_diversity"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You may see a warning issued above; this is safe to ignore.\n",
    "\n",
    "**Note:** If you built a phylogenetic tree, you should pass the path to that tree via `-t` and not pass ``--nonphylogenetic_diversity``.\n",
    "\n",
    "You can view the output of `core_diversity_analyses.py` using `FileLink`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "FileLink('cdout/index.html')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Precomputed results\n",
    "\n",
    "In case you're having trouble running the steps above, for example because of a broken QIIME installation, all of the output generated above has been precomputed. You can access this by running the cell below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "FileLinks(\"its-soils-tutorial/precomputed-output/\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
